Preserving sets of information in rollup tables

ABSTRACT

Techniques for making aggregated entries in a database table which aggregate information from other entries in tables in the database system. The techniques permit the aggregated entries to contain not only metric values aggregated from the other entries by techniques such as averaging in which the individual values are lost, but also sets of individual values from the other entries. One area of application for the techniques is the roll up tables used in the management systems for database management systems to reduce the size of historic information about events that have occurred in the database management system. Each roll up entry in a roll up table is an aggregated entry that contains information about some number of events. A roll up entry that uses the techniques contains a representation of a set whose values are the occurrence times of the events that are represented by the rollup record. Among the techniques that can be used to represent the set of occurrence times are a comma list of the occurrence times and a bit map which has a bit for each second in a day. Roll up entries that contain such representations of sets of occurrence times may be analyzed to determine whether occurrences of events are related, and if they are, the fact of the relationship can be used to design filters that can be applied in the roll up process, in error reporting, and in the analysis of the roll up tables.

CROSS REFERENCE TO RELATED APPLICATIONS

BACKGROUND OF THE INVENTION

1. Field of the Invention

The invention relates generally to reducing the size of stored data and more specifically to reducing the size of information used to manage a data processing system.

2. Description of Related Art: FIGS. 1 and 2

Nothing is easier in a modern data processing system than storing data, and nothing is cheaper than the storage media needed to store it. Yet one of the chief consequences of this happy situation is that system managers are forever short of storage for their data. No one likes to actually throw data out, so many techniques have been developed to reduce the size of the data. Size reduction techniques fall into two classes: lossless techniques, in which the size of the data is reduced but no information is lost, and lossy techniques, in which the size of the data is reduced and some of the information is lost, but the interesting information is saved. Both classes have their advantages and disadvantages. Lossless techniques save all the information but reduce its size by encoding the information. The need to preserve all the information limits the amount of reduction that can be achieved, and the need first to encode the information, then to decode it whenever it is read, and then to decode and re-encode it whenever it is altered greatly increases the overhead of working with such information. Far greater size reductions are possible with lossy techniques than with lossless techniques, and encoding can often be avoided, but lossy techniques are based on saving interesting information and therefore tend to be application-dependent, since it is the application that defines what is interesting.

An example of lossy size reduction techniques is the roll up tables that are used in database management systems to reduce the amount of system management information that must be stored. By reducing the amount of information that must be stored, roll up tables also make management of the information easier. For example, queries execute much faster on the roll up tables than they would have on the equivalent non-rolled up tables. FIG. 1 is an overview of a modern DBMS system management system 101, the Oracle® Enterprise Manager, manufactured by Oracle Corporation, Redwood City, Calif. System management system 101 includes a management service 113 that resides on a management server 111, a computer system that includes an Oracle DBMS and network connections to a group of monitored targets 103(i . . . n). Typically, monitored targets 103 are Oracle DBMS systems and Oracle servers which receive requests for information contained in the Oracle DBMSs from users of the World Wide Web. Each monitored target 103(i) includes a management agent 105 which is an agent for management service 113. Agent 105 continually monitors its target 103(i) as specified by commands that agent 105 receives from management service 113 and sends messages about the results of its monitoring to management service 113. The occurrences in target 103(i) that agent 105 monitors are termed events in the following. An event may be a hit on a Web page that is stored on target 103(i), it may be a system condition that has crossed a threshold, such as the amount of disk space available, or any other occurrence that is of interest to management service 113. Commands from service 113 and messages from agent 105 are communicated using Internet protocols across a network connecting management service 113 and target 103(i). Management service 113 may respond to a message by storing its contents in tables in management repository 115, an Oracle DBMS. Management service 113 may of course also take immediate action in response to a message. The immediate action may be an automatic response to the message or it may be providing a message to central console 121, which a database administrator (DBA) uses to communicate with management service 113. The DBA may then use console 121 to enter commands to management service 113 to deal with the situation noted by management service 113. The DBA may also use console 121 to investigate the current and historical state of the monitored targets 103 and to reconfigure the monitored targets.

If there is a large number of monitored targets 103, the agents return an enormous amount of information to management service 113 and most of the returned information ends up in management repository 115. Consequently, management repository 115 quickly fills up. To reduce the amount of space required by the returned information in repository 115, management service 113 periodically aggregates the older returned information to reduce its size. To aggregate the information, management service 113 rolls up the older returned information to produce rolled up information which is much smaller than the information it was made from and then replaces the older information with the rolled up information. Thus, in FIG. 1, management repository 115 includes current non-rolled up information 117 and less current rolled up information 119.

Since the rolled up information is on the one hand historical but on the other hand needs to be easily accessible to central console 121, management service 113 uses lossy techniques to do the roll up. FIG. 2 gives a simplified example. The events being monitored in the example are hits on Web pages. For purposes of the example, management repository 115 is taken to include a page hit table 101 which has entries that record accesses by users on the World Wide Web to Web pages provided by monitored targets 103. There is a hit entry 109 for each hit on a page in a monitored target 103. Each entry includes three items of data: the URL (uniform resource locator) for the page, a time/date stamp 105 which indicates when the hit occurred, and the source Internet address 107 of the entity that made the hit.

Clearly, page hit table 101 will grow very rapidly. Management service 113 consequently periodically rolls up table 101 to produce a page hit roll up table 111 for a period X. Roll up table 111 contains only two columns: one, 113, for page URLs and one, 115, that indicates the number of hits received on the page during the period X. There is only one entry for each of the page URLs in table 111, and the value of field 115 for the entry is the number of hits experienced by the page in the period X. As will be immediately apparent from the foregoing, management service 113 makes table 111 from table 101 by making a single entry 117 in table 111 for each of the URLs that is present in table 101, counting the number of entries for each URL in table 101 for the period X, and placing the result of the count in no. of hits field 115. As will also be immediately apparent, table 111 is far, far smaller than table 101. In the following, tables like page hit roll up table 111 will be called aggregation tables and their entries aggregated entries, since each entry in table 111 may aggregate information from many entries in table 101. Further, values such as number of hits 115 which are made by combining a set of values such that the individual values in the set are lost will be termed herein metric values. Other examples of such metric values are averages, maxima, minima, modes, and medians. The meaning of a metric value of course depends on the kind of event. For example, if the event indicates that a condition to which the DBA must respond has arisen, the metric value may indicate the time between the time at which the event occurred and the time at which the DBA responded, and the aggregated metric value may be the average response time.
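The count-based roll up just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple layout of a hit entry, the sample data, and the name roll_up_counts are hypothetical and do not come from the management system itself.

```python
from collections import Counter
from datetime import datetime

# Hypothetical non-rolled-up hit entries: (page URL, time/date stamp, source address).
hits = [
    ("/a", datetime(2003, 5, 1, 0, 0, 5), "10.0.0.1"),
    ("/a", datetime(2003, 5, 1, 0, 1, 10), "10.0.0.2"),
    ("/b", datetime(2003, 5, 1, 8, 0, 0), "10.0.0.1"),
    ("/a", datetime(2003, 5, 2, 9, 0, 0), "10.0.0.3"),
]

def roll_up_counts(entries, start, end):
    """Return {url: number of hits} for entries with start <= stamp < end."""
    return dict(Counter(url for url, stamp, _source in entries if start <= stamp < end))

# Roll up the period X = May 1, 2003; the individual time stamps are lost.
daily = roll_up_counts(hits, datetime(2003, 5, 1), datetime(2003, 5, 2))
```

The result holds one entry per URL, which is why the roll up table is so much smaller than the table it was made from.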

Of course, table 111 may itself be rolled up. For example, if the period X is one day and there is thus a roll up table 111 for each day, a weekly roll up table may be made from seven daily tables 111. Again, there would be one entry for each URL upon which there was a hit during the week, and no. of hits 115 would contain the number of hits for the week. The week tables may be rolled up into month tables, the month tables into year tables, and so on. The creation of any roll up entry may be regarded as a roll up event at a roll up level n, with the entry created by the roll up being a roll up event entry for level n and the roll up at level n+1 rolling up the roll up event entries for level n.

Aggregation tables are challenging to design. The challenge is to reduce the size of the information in the aggregation table as much as possible while reducing the usefulness of the information contained in the table as little as possible. Table 111 illustrates the difficulty. In page hit table 101, the time at which each hit occurred is recorded in time/date field 105; this information is lost in table 111. Thus, though table 111 can tell the DBA how many times a page was hit in the period X, it cannot tell the DBA anything about the temporal distribution of hits over the period X. This pattern information may, however, be exactly what the DBA needs to correctly distribute copies of the page among monitored targets 103. What is needed if rollup table 111 is to provide useful information about the temporal distribution of the hits is a way of representing time/date information 105 in aggregated page hit entry 117 for the individual hit entries 109 that have been aggregated into entry 117. In more general terms, the problem is this: how to incorporate information that consists of sets of values into aggregation tables. What is needed, and what is provided by the invention disclosed herein, is a technique for doing this.

SUMMARY OF THE INVENTION

In one aspect, the technique for incorporating information that consists of sets of values into an aggregation table is a method of aggregating a plurality of entries in a table in a database management system into an aggregated entry. The method includes the step of making an aggregated entry that represents a plurality of the table entries and that includes a field whose value is a representation of a set that may have a plurality of members and the step of deriving members of the set from values contained in the plurality of table entries represented by the aggregated entry.

Further refinements of the technique include deleting the plurality of entries which the aggregated entry represents when the aggregated entry has been finished, using a representation of the set which varies with the number of members in the set, representing the set as a character string wherein each member of the set is represented by a sequence of characters and the sequences of characters are separated by a separator character, using a representation of the set which has a size that is constant regardless of the number of members in the set, and in such a representation, representing the set as a string of elements, with there being an element corresponding to each potential member of the set and the presence of a particular member in the set being indicated by a first value of the corresponding element and the member's absence by a second value of the corresponding element. In one application of the technique, the values of the members of the set are time values; in another, they are location values.

In another aspect, the technique for incorporating information that consists of sets of values into an aggregation table is a method of rolling up event information. The method is practiced in a management system for a database management system. The event information is contained in event entries in a table in the database management system and includes a time of occurrence for each event. The method includes the step of making a roll up entry that represents a plurality of the event entries and includes a representation of a set whose members are times of occurrences and the step of deriving the members of the set from the times of occurrences in the plurality of event entries.

Further refinements of the technique include:

-   the step of aggregating metric values in the plurality of event entries to produce an aggregated metric value in the roll up entry;
-   the step of deleting the plurality of event entries represented by the roll up entry;
-   representing the set as a character string wherein each time of occurrence is represented by a sequence of characters and the sequences of characters are separated by a first separator character;
-   including the period of time during which the times of occurrences in the entries represented by the roll up entry occurred in the roll up entry;
-   including the number of events represented by the roll up entry in the roll up entry; and
-   using digests in the roll up records to represent fields that have the same value in every one of the records represented by the roll up record.

In another aspect, the sets of occurrence times in roll up table entries may be used to detect relationships between events. Where there is a relationship between events, there should also be a relationship between the times of occurrence of the events and the temporal relationship can be detected by comparing sets of occurrence times.

Other objects and advantages will be apparent to those skilled in the arts to which the invention pertains upon perusal of the following Detailed Description and drawing, wherein:

BRIEF DESCRIPTION OF THE DRAWING

FIG. 1 is an overview of a management system 101 for a data processing system;

FIG. 2 is an example of a roll up table;

FIG. 3 shows a version of the table of FIG. 2 which has been modified so that the aggregation records represent a set of time values;

FIG. 4 shows a table containing information upon which an alert may be based; and

FIG. 5 shows an example alert history table that includes entries for various roll up intervals.

Reference numbers in the drawing have three or more digits: the two right-hand digits are reference numbers in the drawing indicated by the remaining digits. Thus, an item with the reference number 203 first appears as item 203 in FIG. 2.

DETAILED DESCRIPTION

The following Detailed Description will first show how the roll up table of FIG. 2 may be modified to represent a set of values specifying when the hits on the page URL occurred, will then discuss techniques for representing sets of values, and will then describe a table which rolls up event information including when the event occurred, as well as uses for such a table.

Modifying Page Hit Rollup Table 111 to Specify the Times that the Hits Occurred: FIG. 3

FIG. 3 shows a version 301 of page hit rollup table 111 which has been modified to include a representation of the time/date information contained in field 105 of page hit table 101. The representation makes up column 303 of table 301. The data in fields belonging to column 303 represents the time/date information for all of the records from table 101 that are aggregated into an entry in table 301 as a set of the times at which the hits occurred. Thus, for the record that aggregates the hits on page URL A, the set value is the set {x}; for the record that aggregates the hits on page URL B, the set value is the set {y}. In both cases, the time at which each hit on the record's URL occurred is a member of the set represented by the set value. The set value is made when the records aggregated into the entry in page hit roll up table 301 are aggregated; each time one of the records being aggregated is read, its time/date stamp 105 is made into a member of the set represented by field 303 in the aggregated record. Of course, set values in a rollup record may be further rolled up. For example, the set of times representing hits during a day may be rolled up into a set of times representing hits during a week. Further, the granularity of the set may be reduced in such a rollup. For instance, the weekly rollup may include a set value in which members of the set specify the number of hits per hour for each hour of the week.
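The construction of the set value during aggregation might look like the following Python sketch. The dictionary layout of an aggregated entry and the names second_in_day, roll_up_with_times, and distinct_seconds_per_hour are assumptions made for illustration; the time of a hit is assumed to be stored as the second within the day.

```python
from datetime import datetime

def second_in_day(stamp):
    """Convert a time/date stamp to the second within its day (0..86399)."""
    return stamp.hour * 3600 + stamp.minute * 60 + stamp.second

def roll_up_with_times(entries):
    """Aggregate (url, stamp) entries into {url: {"hits": n, "hit_times": set}}."""
    rolled = {}
    for url, stamp in entries:
        entry = rolled.setdefault(url, {"hits": 0, "hit_times": set()})
        entry["hits"] += 1                               # metric value: the count
        entry["hit_times"].add(second_in_day(stamp))     # set value: the hit times
    return rolled

def distinct_seconds_per_hour(hit_times):
    """Reduce granularity: count the distinct hit seconds falling in each hour."""
    counts = [0] * 24
    for second in hit_times:
        counts[second // 3600] += 1
    return counts

entries = [
    ("A", datetime(2003, 5, 1, 0, 0, 5)),
    ("A", datetime(2003, 5, 1, 0, 1, 10)),
    ("B", datetime(2003, 5, 1, 8, 0, 0)),
]
rolled = roll_up_with_times(entries)
```

The count is kept separately from the set because two hits in the same second collapse into a single set member.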

There are many ways of representing the set of values; fundamentally, any technique which can be used to represent a list of values can be used to represent a set of values. Moreover, at least some advantage will be gained by any representation of the set of values in which the representation of a particular value is smaller than the representation used for the value in the non-rolled up entry. Which technique is chosen to represent the set of values depends on the storage space/processing time tradeoffs for a particular application and on the sparseness of the information, that is, the relationship of the number of values that a set will actually have to the number of values the set can potentially have. For example, if the time of a hit is specified in seconds, there are 86400 seconds in a 24 hour period, so set of hit times 303 in a day rollup table could potentially have that many values. However, if a page is known to typically receive 100 hits a day, the set will typically have only 100 of the possible 86400 values. Such a set is termed herein a sparse set of time values. Conversely, a set in which the number of values in the set approaches the maximum possible is termed a dense set.

Two techniques for representing time values are shown in FIG. 3. At 305 is a comma list set data item and at 307 is a bit set data item. Comma list set data item 305 is a varying-length character string in which the time of each hit is represented by an integer indicating a second. The integers on the list are separated by a separator character, in this case, the comma. Thus, if page URL A has hits at 5 seconds after midnight, 70 seconds after midnight, and so on, the set of hits can be represented by the character string 5,70, and so on for each hit. The string shown at 305 includes a cluster of hits around 8:00 AM; these are represented by the sequence of numbers and commas 28795,28796,28797,28801,28805,28806. The comma list is well suited for sparse sets; however, each numeral and comma requires 8 bits of storage, and the storage requirements for comma list representations of dense sets become quite large. One way of dealing with this problem is to give the comma list set data item a maximum size which will accommodate a high percentage of the sets and provide an overflow table which stores set members that cannot be accommodated in the comma list set data item. An entry in the overflow table would be accessed by the same key used to access the rollup table entry.
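A minimal sketch of the comma list with a maximum size and an overflow store follows. The function names and the choice to return the overflow members as a plain list (standing in for an overflow table row keyed like the rollup entry) are assumptions for illustration.

```python
def encode_comma_list(times, max_chars):
    """Return (comma_list, overflow): members that fit in max_chars characters
    go into the comma list; the remaining members go to the overflow list."""
    items = sorted(times)
    kept, used = [], 0
    for index, value in enumerate(items):
        token = str(value)
        needed = len(token) + (1 if kept else 0)   # +1 for the separating comma
        if used + needed > max_chars:
            return ",".join(kept), items[index:]
        kept.append(token)
        used += needed
    return ",".join(kept), []

def decode_comma_list(text, overflow=()):
    """Rebuild the full set from the comma list and any overflow members."""
    return {int(token) for token in text.split(",") if token} | set(overflow)

hit_times = {5, 70, 28795, 28796, 28797, 28801, 28805, 28806}
comma_list, overflow = encode_comma_list(hit_times, max_chars=20)
```

With a 20-character limit the first four members fit in the data item and the cluster's remaining members spill into the overflow list; decoding both parts restores the original set.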

In a bit set data item 307, each possible member of the set is mapped to a single bit in a bit string. Thus, there are 86400 seconds in 24 hours and bit set data item 307 has 86400 bits, numbered 0 . . . 86399. When a hit has occurred on a page in one of these seconds, the bit for the second is set to 1; otherwise, the bit has the value 0. Thus, as shown at 307, bits 4 and 69 are set to 1 because hits happened at seconds 5 and 70. Bit set data item 307 is well suited for sets that are at least potentially dense, since the size of bit set data item 307 remains the same regardless of the number of seconds in which hits occur. Another advantage which stems from the fact that the size of bit set data item 307 remains constant is that changes in the number of events do not cause changes in the amount of storage required for the roll up table entry. This property is particularly important with events that may occur in bursts.
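The bit set data item can be sketched with a fixed-size byte array. The sketch assumes, following the example in the text, that second n of the day maps to bit n-1; the function names and the packing of eight bits per byte are illustrative choices, not part of the described system.

```python
SECONDS_PER_DAY = 86400

def make_bit_set(seconds):
    """Return an 86400-bit set (10800 bytes) with one bit per second of the day.
    Assumption taken from the text's example: second n maps to bit n - 1."""
    bits = bytearray(SECONDS_PER_DAY // 8)
    for second in seconds:
        if not 1 <= second <= SECONDS_PER_DAY:
            raise ValueError(f"second out of range: {second}")
        index = second - 1
        bits[index // 8] |= 1 << (index % 8)
    return bits

def bit_set_members(bits):
    """List the seconds whose bits are set, in increasing order."""
    return [i + 1 for i in range(len(bits) * 8) if (bits[i // 8] >> (i % 8)) & 1]

sparse = make_bit_set([5, 70])
dense = make_bit_set(range(1, SECONDS_PER_DAY + 1))
```

Both bit sets occupy the same 10800 bytes, which illustrates why a burst of events does not change the storage needed for the roll up entry.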

Other representations of the list of time values are of course possible. For example, in a database system that permits nested tables, i.e., the value of a field of a record in a table may itself be a table, the hits could be represented as a table of hits, with an entry for each hit.

Representing other Kinds of Information using Set Values

The time/date information from page hit table 101 is particularly easy to represent in set data items because the information for a given page URL 103 forms a monotonically increasing set of values. However, other kinds of values can be represented in set values as well. For example, the comma list can be used with values that do not increase monotonically or for sets where the values are tuples instead of single numbers. For example, a weekly rollup might represent the hit times with a comma list like this: . . . ;day of week,second in day; . . . where the entries in the list are separated by semicolons and the tuple in the entry specifies the day of the week and the second the hit occurred on that day. Comma lists with tuples can similarly be used to represent sets of coordinates in two and three dimensions. Where nested tables are possible, the nested tables could be used in place of the comma lists. Where a set never has two members with the same value and the number of possible values is finite, the set may be mapped to a bit set data item as described above for time values. The bit set data item can of course have more than one dimension; thus, a bit set data item might be used in the weekly rollup to represent a histogram of the hits that occurred during the hours making up the week.
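The tuple form of the comma list for a weekly rollup might be encoded and decoded as follows. The numbering of days of the week as integers and the function names are assumptions made for the sketch.

```python
def encode_weekly(hits):
    """Encode {(day_of_week, second_in_day), ...} as 'day,second;day,second;...'."""
    return ";".join(f"{day},{second}" for day, second in sorted(hits))

def decode_weekly(text):
    """Decode the semicolon-separated list back into a set of tuples."""
    return {
        tuple(int(part) for part in entry.split(","))
        for entry in text.split(";")
        if entry
    }

weekly_hits = {(2, 70), (1, 28800), (1, 5)}
encoded = encode_weekly(weekly_hits)
```

The same pattern extends to coordinates in two or three dimensions by widening the tuple.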

Using Rollups with Set Values for Events and Alerts

As already pointed out, an event is simply something that happens in a system that has significance for the management of the system. A hit on a Web page is one example of an event in database system management system 101. In the context of management service 113, some events are termed alerts, and in the following, the terms “alert” and “event” are used interchangeably. When a series of events is being analyzed, it is often crucial to know when the events in the series occurred; consequently, entries in tables that record occurrences of events generally include a field that indicates the date and time at which the event occurred, and it is important that when such event tables are rolled up, the temporal information be preserved. The techniques just described may be used to do this. In the following, an example alert history roll up table that uses the techniques will be described in detail.

The Alert History Information

When an instance of an alert for which the history information is being maintained occurs, the management system stores information about the alert in an entry in a table called the mgmt_severity table. The information in entries of that table has the following form:

TargetName            string     [key]
TargetType            string     [key]
MetricName            string     [key]
MetricColumn          string     [key]
KeyValue              string     [key]
CollectionTimestamp   datetime   [key]
AlertState            enum       [key]
MetricLabel           string
ColumnLabel           string
Message               string

TargetName is the name of the target 103 from which the alert information was obtained; MetricName is the name of a table for the target which contains the values being monitored; an example of such a table is shown in FIG. 4. MetricColumn is the column 405 in the table 401 specified by MetricName that contains the metric values 411 being rolled up. KeyValue is the value 409 of the entry's key in table 401. CollectionTimestamp is the timestamp for when the information from table 401 was collected. AlertState is the severity of the alert; in a preferred environment, the possible severities are warning and critical. MetricLabel and ColumnLabel are labels used for reports generated from the entry. Message, finally, is a message that explains the alert.

An example of the above fields with the values obtained from table 401 follows:

TargetName            string   MySystem
TargetType            string   Host
MetricName            string   DiskInformation
MetricColumn          string   SpaceUsed
KeyValue              string   /prv
CollectionTimestamp            01-MAY-2003 00:00:05
AlertState                     Warning
MetricLabel                    Important Disk Information
ColumnLabel                    Total Space Used
Message                        Space Used Threshold exceeded

The Alert History Table

The alert history table is a single table that contains aggregate records that represent daily and monthly rollups of records for a particular kind of alert from a particular target system that have a particular severity. The size of the records in the alert history is reduced by replacing information that is repeated in every alert history record for a particular type of alert with a digest (for example, a hash value) made from that information. Here, two digests are made: one, termed InstanceId, from the fields TargetName, TargetType, MetricName, MetricColumn, KeyValue, MetricLabel, and ColumnLabel, for which the digest will be the same for every alert of a given type coming from a given target, and another, termed MessageId, from the field Message, which will be the same for every alert of the given type. The digests are used not only to represent the information they are made from in the alert history table, but also as indexes to locate the information they are made from in tables in repository 115.
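A digest of the repeated fields could be computed as in the following sketch. The text says only that a digest such as a hash value is used; the choice of SHA-256, the unit-separator character used to join the fields, and the function name digest are assumptions for illustration.

```python
import hashlib

def digest(*fields):
    """Return a hex digest of the given field values (SHA-256 is an assumed choice)."""
    joined = "\x1f".join(fields)   # unit separator keeps field boundaries unambiguous
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

# Field values taken from the example mgmt_severity entry.
instance_id = digest(
    "MySystem", "Host", "DiskInformation", "SpaceUsed", "/prv",
    "Important Disk Information", "Total Space Used",
)
message_id = digest("Space Used Threshold exceeded")
```

Because the digest depends only on the field values, every alert of the same type from the same target yields the same InstanceId, so the digest can also serve as a lookup key for the full field values stored elsewhere.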

The entries of any alert history table made according to the example include the following fields:

InstanceID         [key]
AlertState         [key]
AlertCount
AlertOccurrences
RollupTime         [key]
RollupInterval
MessageID

AlertState is the same as in the mgmt_severity table. AlertCount is the number of alerts represented by the aggregation entry; AlertOccurrences is the value representing the set of times at which the alerts occurred; RollupTime is the beginning of the time period covered by the roll up. RollupInterval indicates the roll up's window, beginning at RollupTime. In the example, there are two windows: a day window and a month window. In the preferred embodiment, when a monthly rollup is made, the daily rollups that it is based on are removed from the alert history table. As indicated above, the record's key is made from the InstanceID, AlertState, and RollupTime fields, which together give a unique value for every record in the table. Additionally, the records for an alert history table for a given alert may contain aggregated metric values 411 for the rollup period. In this case, the specified metric column is SpaceUsed, so the values being rolled up are the values in that column. The results of the roll up may include the average amount of space being used when the alert was triggered during the roll up period, the minimum amount, and the maximum amount.
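The aggregated metric values mentioned at the end of the preceding paragraph can be sketched as follows. The sample SpaceUsed values and the function name aggregate_metric are hypothetical.

```python
def aggregate_metric(values):
    """Aggregate metric values for a roll up period into average, minimum, maximum."""
    values = list(values)
    if not values:
        raise ValueError("no metric values to aggregate")
    return {
        "average": sum(values) / len(values),
        "minimum": min(values),
        "maximum": max(values),
    }

# Hypothetical SpaceUsed values recorded when the alert was triggered.
space_used = aggregate_metric([80, 90, 100])
```

Unlike the set of occurrence times, these aggregates do not allow the individual values to be recovered.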

FIG. 5 shows how the parts of the record other than the metric values look before and after a monthly rollup. All of the records in the alert history table have the fields set forth above: InstanceId at 503, AlertState at 505, AlertCount at 507, AlertOccurrences, which contains a set of time values, at 509, RollupTime at 511, RollupInterval at 513, and MessageId at 515. At 501 are shown two daily roll up entries, 501(i) and 501(j), as indicated by the values of field 513 in the records. Both records are for warnings (field 505). As indicated by RollupTime field 511, 501(i)'s roll up window was May 1 and 501(j)'s roll up window was May 5. Both roll ups were made after the end of the window. The values of fields 507 show that there was only one warning on each of those days. Field 509 is a set value 519 that contains a set whose members are the times of each of the alerts that occurred during the roll up period. The set is represented as a comma list, with each time being given as year:day of year:second in the day. Since there were only single alerts on May 1 and May 5, the sets in the entries 501 each include only a single time. An advantage of the format used to represent the time is that it is uniform for all roll up intervals, which makes it easy to apply pattern matching techniques to the format. For example, the pattern 3:121:* would match all warnings that occurred on May 1, 2003.
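The year:day of year:second in the day format and the pattern matching it enables can be sketched as below. Note one assumption that differs from the text's example: the sketch writes the year with all four digits, so the pattern for May 1, 2003 becomes 2003:121:* rather than 3:121:*. The function names are illustrative, and shell-style wildcard matching from Python's standard fnmatch module stands in for whatever pattern matcher a real system would use.

```python
from datetime import datetime
from fnmatch import fnmatchcase

def format_occurrence(stamp):
    """Format a time stamp as 'year:day of year:second in the day'."""
    second = stamp.hour * 3600 + stamp.minute * 60 + stamp.second
    return f"{stamp.year}:{stamp.timetuple().tm_yday}:{second}"

def match_occurrences(comma_list, pattern):
    """Return the members of a comma list that match a wildcard pattern."""
    return [t for t in comma_list.split(",") if t and fnmatchcase(t, pattern)]

may_1 = format_occurrence(datetime(2003, 5, 1, 0, 0, 5))
may_5 = format_occurrence(datetime(2003, 5, 5, 8, 20, 0))
occurrences = ",".join([may_1, may_5])
on_may_1 = match_occurrences(occurrences, "2003:121:*")
```

Because every roll up interval uses the same format, the same pattern works unchanged against daily and monthly entries.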

Entry 517(k) is a monthly roll up entry in the alert history table for the month of May. It is made beginning at midnight on June 1. Entry 517 has the same form as entries 501, except that RollupInterval 513 indicates that the roll up window is one month and set value field 519 contains the times from the daily rollup entries that were rolled up into the monthly roll up entry.
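Making a monthly entry from daily entries might be sketched as follows. The dictionary representation of an entry, the placeholder digest values, the four-digit year in the occurrence times, and the function name roll_up_month are assumptions; the sketch also assumes that all daily entries passed in share the same InstanceId, AlertState, and MessageId.

```python
def roll_up_month(daily_entries, month_start):
    """Combine daily roll up entries for one alert into a monthly entry."""
    times, count = [], 0
    for entry in daily_entries:
        count += entry["AlertCount"]
        times.extend(t for t in entry["AlertOccurrences"].split(",") if t)
    # Sort numerically on (year, day of year, second), not as strings.
    times.sort(key=lambda t: tuple(int(part) for part in t.split(":")))
    first = daily_entries[0]
    return {
        "InstanceId": first["InstanceId"],
        "AlertState": first["AlertState"],
        "AlertCount": count,
        "AlertOccurrences": ",".join(times),
        "RollupTime": month_start,
        "RollupInterval": "month",
        "MessageId": first["MessageId"],
    }

daily = [
    {"InstanceId": "i1", "AlertState": "Warning", "AlertCount": 1,
     "AlertOccurrences": "2003:125:30000", "RollupTime": "2003-05-05",
     "RollupInterval": "day", "MessageId": "m1"},
    {"InstanceId": "i1", "AlertState": "Warning", "AlertCount": 1,
     "AlertOccurrences": "2003:121:5", "RollupTime": "2003-05-01",
     "RollupInterval": "day", "MessageId": "m1"},
]
monthly = roll_up_month(daily, "2003-05-01")
```

In a real system the daily entries would then be removed from the table, as the preferred embodiment describes.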

Uses of Alert History Tables with Sets of Values

Alert history tables in which alert counts and times of occurrences of alerts are maintained simplify many different kinds of analysis. The alert count makes it easy to determine the relative frequency of alerts, and thus the most prevalent problems in the system. Moreover, comparison of the frequencies of a given kind of alert in rollups having different roll up times but the same rollup interval will show whether a problem is getting more or less frequent.

Alert counts and alert times can also be used to determine whether one event is dependent on another. As individual alerts are rolled up over time, heuristics can be used to identify strongly related events. For example, in the case where one event always causes a separate event to trigger, the alert count for the triggering event must be less than or equal to the alert count for the dependent event. This property of related events can be used to reduce the search space required to identify the dependent event.

A simple example of this use of alert counts is the following: Among the events that generate alerts are host unavailability and database unavailability. One would expect that when the host is unavailable, the database is also unavailable, and thus that two conditions should hold:

-   There should always be at least as many database unavailable alerts as there are host unavailable alerts. (The database may become unavailable for reasons other than the host becoming unavailable.)
-   There should always be a database unavailable alert within close temporal proximity of each host unavailable alert.

The first step in determining whether database unavailability is dependent on host unavailability is to look at several pairs of rollup entries for database unavailable alerts and host unavailable alerts where the records in the pair are for the same rollup time and the largest available rollup interval. If the number of database unavailability alerts in each of these pairs is greater than or equal to the number of host unavailability alerts, there is a strong possibility that there is a dependency relationship between host unavailability and database unavailability.

Once the possibility of a dependency relationship has been determined, the existence of the relationship can be confirmed by using the information that is contained in AlertOccurrences 509. If the dependency exists, there should be a temporal relationship between each host unavailable alert and a database unavailable alert. Whether this is in fact the case can be confirmed by selecting a rollup entry for each alert and comparing the sets of values in the AlertOccurrences fields. If it turns out further that there is a database unavailable alert corresponding to each host unavailable alert, but the reverse is not true, then it is clear that database unavailability events are dependent on host unavailability events and not vice-versa.
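The two-step check described above, first on alert counts and then on occurrence times, can be sketched as follows. The sketch makes several assumptions: occurrence times are reduced to plain integers in seconds, "close temporal proximity" is taken to mean that the dependent alert occurs at or after the triggering alert and within a caller-supplied window, and all function names are hypothetical.

```python
from bisect import bisect_left

def counts_allow_dependency(count_pairs):
    """Step 1: in every (trigger_count, dependent_count) pair drawn from rollup
    entries with the same rollup time, the dependent count must be >= the trigger count."""
    return all(dependent >= trigger for trigger, dependent in count_pairs)

def each_followed_within(trigger_times, dependent_times, window):
    """Step 2: every trigger time has a dependent time in [t, t + window]."""
    dependent = sorted(dependent_times)
    for t in trigger_times:
        i = bisect_left(dependent, t)          # first dependent time >= t
        if i == len(dependent) or dependent[i] - t > window:
            return False
    return True

host_down = [100, 5000]
db_down = [102, 3000, 5003]
plausible = counts_allow_dependency([(2, 3), (1, 1)])
db_follows_host = each_followed_within(host_down, db_down, window=10)
host_follows_db = each_followed_within(db_down, host_down, window=10)
```

When the check succeeds in one direction and fails in the other, as here, the result points to database unavailability depending on host unavailability and not the reverse.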

Queries based on the kind of analysis just described can be used to automatically identify less-obvious dependency relationships. Further, the alert occurrence information in the rollup entries can also be used to determine whether events in one set of events have a temporal relationship to events in another set of events, and this information can be used to identify relationships between events that are less obvious and less strong than the simple dependency of one event on another. Once a temporal relationship between events has been identified, the relationship can be used to filter event entries. For example, if a host unavailable alert always results in a database unavailable alert, the only database unavailable alerts that are really of interest to the DBA are those that are not caused by the unavailability of the host, and when a roll up is made for database unavailable alerts, only those database unavailable alerts that do not happen shortly after a host unavailable alert might be included in the roll up.
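A filter of the kind just described might look like the following sketch. It again assumes occurrence times expressed as integer seconds; the function name and the 60-second definition of "shortly after" are assumptions made for illustration.

```python
from bisect import bisect_left


def independent_alerts(db_times, host_times, window_s=60):
    """Return the database unavailable times that do NOT fall within
    window_s seconds after any host unavailable time."""
    host = sorted(host_times)
    kept = []
    for t in sorted(db_times):
        # First host alert at or after t - window_s; if it is also at or
        # before t, the database alert is attributed to the host alert.
        i = bisect_left(host, t - window_s)
        caused_by_host = i < len(host) and host[i] <= t
        if not caused_by_host:
            kept.append(t)
    return kept


print(independent_alerts([105, 3000, 5010], [100, 5000]))  # [3000]
```

Only the alerts returned by such a filter would then be passed to the roll up for database unavailable alerts.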

Another use of such information about the relationship between events is to reduce the number of messages that are generated during a so-called event storm. For example, if it is known that when a host unavailable event occurs, it causes an event storm, that is, the occurrence of a large number of other events for which management agent 105 generates messages to management service 113, management agent 105 or management service 113 could be set up to provide the host unavailable event message and suppress all of the messages for the events caused by the host unavailable event. Suppressing such extraneous messages reduces the amount of information which must be stored in repository 115 and more importantly, makes it easier for the DBA at central console 121 or the DBA who is analyzing the messages in repository 115 to understand what is really going on.
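One way such suppression might be arranged is sketched below. The message format, the `caused_by` mapping, the "listener_unavailable" event type, and the 60-second window are all hypothetical; the sketch does not describe how management agent 105 or management service 113 is actually implemented.

```python
def suppress_storm(messages, caused_by, window_s=60):
    """Forward only messages that are not consequences of a recent root event.

    messages:  iterable of (time_s, event_type) tuples, ordered by time.
    caused_by: dict mapping a dependent event type to its root event type.
    """
    last_seen = {}   # event type -> time of its most recent occurrence
    forwarded = []
    for time_s, event_type in messages:
        root = caused_by.get(event_type)
        root_time = last_seen.get(root)
        in_storm = root_time is not None and time_s - root_time <= window_s
        if not in_storm:
            forwarded.append((time_s, event_type))
        last_seen[event_type] = time_s
    return forwarded


caused_by = {
    "database_unavailable": "host_unavailable",
    "listener_unavailable": "host_unavailable",
}
messages = [
    (100, "host_unavailable"),
    (102, "database_unavailable"),
    (103, "listener_unavailable"),
    (4000, "database_unavailable"),
]
print(suppress_storm(messages, caused_by))
# [(100, 'host_unavailable'), (4000, 'database_unavailable')]
```

The database unavailable message at time 4000 is forwarded because no host unavailable event precedes it within the window, so it is one of the events that is of real interest to the DBA.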

Conclusion

The foregoing Detailed Description has disclosed to those skilled in the relevant technologies how to make and use aggregated entries in database tables that preserve sets of values obtained from the entries from which the aggregated entries were made and has further disclosed the best mode presently known to the inventors of implementing their invention. It will be immediately apparent to those skilled in the relevant technologies that there are many ways of implementing the techniques of the invention and that the particular characteristics of a given implementation will be strongly determined by the environment in which the implementation is made. For example, when the invention is used to extend an existing roll up system, many details of the implementation will be determined by the existing roll up system. Further, there are many ways in which a set of values can be represented, and the particular representation chosen will be determined by factors such as the properties of the set and the trade off between storage cost and computation cost offered by the system in which the techniques are employed.

For all of the foregoing reasons, the Detailed Description is to be regarded as being in all respects exemplary and not restrictive, and the breadth of the invention disclosed herein is to be determined not from the Detailed Description, but rather from the claims as interpreted with the full breadth permitted by the patent laws.

1. A method of aggregating a plurality of entries in a table in a database management system into an aggregated entry, the method comprising the steps of: making the aggregated entry, the aggregated entry representing the plurality of entries and including a field whose value is a representation of a set that may have a plurality of members; and deriving members of the set from values contained in entries belonging to the plurality thereof.
2. The method set forth in claim 1 further comprising the step of: deleting the plurality of entries represented by the aggregated entry.
3. The method set forth in claim 1 wherein: the representation of the set has a size which varies with the number of members in the set.
4. The method set forth in claim 3 wherein: the representation of the set represents the set as a character string wherein each member is represented by a sequence of characters and the sequences of characters are separated by a separator character.
5. The method set forth in claim 1 wherein: the representation of the set has a size which is constant regardless of the number of members in the set.
6. The method set forth in claim 5 wherein: the representation of the set represents the set as a string of elements, there being an element corresponding to each potential member of the set, the presence of a particular member in the set being indicated by a first value of the corresponding element and the absence of the particular member being indicated by a second value of the corresponding element.
7. The method set forth in claim 1 wherein: in the step of deriving members of the set, the values from which the members of the set are derived are time values.
8. The method set forth in claim 1 wherein: in the step of deriving members of the set, the values from which the members of the set are derived are location values.
9. A method of rolling up event information that is practiced in a management system for a database management system, the event information being contained in event entries in a table in the database management system and including a time of occurrence for each event and the method comprising the steps of: making a roll up entry that represents a plurality of the event entries and includes a representation of a set whose members are times of occurrences; and deriving the members of the set from the times of occurrences in the plurality of event entries.
10. The method set forth in claim 9 wherein the roll up entry further includes an aggregated metric value and the method further comprises the step of: aggregating metric values in the plurality of event entries to produce the aggregated metric value.
11. The method set forth in claim 9 wherein the method further comprises the step of: deleting the plurality of event entries represented by the roll up entry.
12. The method set forth in claim 9 wherein: the representation of the set represents the set as a character string wherein each time of occurrence is represented by a sequence of characters and the sequences of characters are separated by a first separator character.
13. The method set forth in claim 12 wherein: the sequence of characters represents the time of occurrence as the sequence <year> second separator character <day_of_year> second separator character <second_in_day>.
14. The method set forth in claim 9 wherein the representation of the set has a first portion in the entry that is used until no more members can be placed therein; and when no more members can be placed therein, the method includes the step of: making a second portion of the representation in another table in the database management system, whereby space is made for further members.
15. The method set forth in claim 9 wherein: the plurality of event entries represented by the roll up entry have times of occurrences that are within a period of time.
16. The method set forth in claim 15 wherein: the roll up entry includes a representation of the period of time.
17. The method set forth in claim 16 wherein: the representation of the period of time includes a representation of a time that is a start or end of the period of time and a representation of a length of time.
18. The method set forth in claim 9 wherein the roll up entry further includes a representation of the number of events represented by the roll-up entry; and the method further comprises the step of: counting the number of events represented by the event entries to obtain a total number of events and setting the representation of the number of events to the total number of events.
19. The method set forth in claim 9 wherein the plurality of event entries have one or more fields which have the same values in each of the plurality of event entries; the rollup entry includes a field which contains a digest of the values of the one or more fields; and the method includes the step of making the digest from the one or more fields.
20. The method set forth in claim 19 wherein: the one or more fields specify a class of events to which the event that is specified by each of the event entries belongs.
21. The method set forth in claim 20 wherein: the one or more fields specify the class of events by specifying the source of the event and a condition that caused the event.
22. The method set forth in claim 19 wherein: the field from which the digest is made is a message describing the event.
23. A method of determining whether there is a relationship between different types of events in a database system that employs roll up tables whose entries represent events that occur over a period of time and that further include sets of occurrence times during the period of time, the method comprising the steps of: selecting a first roll up table entry for a first type of event; selecting a second roll up table entry for a second type of event that represents the same period of time as the first roll up table entry; and determining whether there is a temporal relationship between at least some of the occurrence times in the first roll up table's set of occurrence times and at least some of the occurrence times in the second roll up table's set of occurrence times.
24. The method set forth in claim 23 wherein: the roll up table entries further include a total number of occurrences value; and the first roll up table entry and the second roll up table entry are selected by comparing the total number of occurrences values to determine whether there may be a relationship between the types of events represented by the first roll up table entry and the second roll up table entry.
25. A data storage device, characterized in that: the data storage device contains code which when executed by a processor performs a method of aggregating a plurality of entries in a table in a database management system into an aggregated entry, the method comprising the steps of: making the aggregated entry, the aggregated entry representing the plurality of entries and including a field whose value is a representation of a set that may have a plurality of members; and deriving members of the set from values contained in entries belonging to the plurality thereof.
26. The data storage device set forth in claim 25 further characterized in that: the method further comprises the step of deleting the plurality of entries represented by the aggregated entry.
27. The data storage device set forth in claim 25 further characterized in that: the representation of the set has a size which varies with the number of members in the set.
28. The data storage device set forth in claim 27 further characterized in that: the representation of the set represents the set as a character string wherein each member is represented by a sequence of characters and the sequences of characters are separated by a separator character.
29. The data storage device set forth in claim 25 further characterized in that: the representation of the set has a size which is constant regardless of the number of members in the set.
30. The data storage device set forth in claim 29 further characterized in that: the representation of the set represents the set as a string of elements, there being an element corresponding to each potential member of the set, the presence of a particular member in the set being indicated by a first value of the corresponding element and the absence of the particular member being indicated by a second value of the corresponding element.
31. The data storage device set forth in claim 25 further characterized in that: in the step of deriving members of the set, the values from which the members of the set are derived are time values.
32. The data storage device set forth in claim 25 further characterized in that: in the step of deriving members of the set, the values from which the members of the set are derived are location values.
33. A data storage device, characterized in that: the data storage device contains code which when executed by a processor performs a method of rolling up event information that is practiced in a management system for a database management system, the event information being contained in event entries in a table in the database management system and including a time of occurrence for each event and the method comprising the steps of: making a roll up entry that represents a plurality of the event entries and includes a representation of a set whose members are times of occurrences; and deriving the members of the set from the times of occurrences in the plurality of event entries.
34. The data storage device set forth in claim 33 further characterized in that: the roll up entry further includes an aggregated metric value and the method further comprises the step of: aggregating metric values in the plurality of event entries to produce the aggregated metric value.
35. The data storage device set forth in claim 33 further characterized in that: the method further comprises the step of deleting the plurality of event entries represented by the roll up entry.
36. The data storage device set forth in claim 33 further characterized in that: the representation of the set represents the set as a character string wherein each time of occurrence is represented by a sequence of characters and the sequences of characters are separated by a first separator character.
37. The data storage device set forth in claim 36 further characterized in that: the sequence of characters represents the time of occurrence as the sequence <year> second separator character <day_of_year> second separator character <second_in_day>.
38. The data storage device set forth in claim 33 further characterized in that: the representation of the set has a first portion in the entry that is used until no more members can be placed therein; and when no more members can be placed therein, the method includes the step of: making a second portion of the representation in another table in the database management system, whereby space is made for further members.
39. The data storage device set forth in claim 33 further characterized in that: the plurality of event entries represented by the roll up entry have times of occurrences that are within a period of time.
40. The data storage device set forth in claim 39 further characterized in that: the roll up entry includes a representation of the period of time.
41. The data storage device set forth in claim 40 further characterized in that: the representation of the period of time includes a representation of a time that is a start or end of the period of time and a representation of a length of time.
42. The data storage device set forth in claim 33 further characterized in that: the roll up entry further includes a representation of the number of events represented by the roll-up entry; and the method further comprises the step of: counting the number of events represented by the event entries to obtain a total number of events and setting the representation of the number of events to the total number of events.
43. The data storage device set forth in claim 33 further characterized in that: the plurality of event entries have one or more fields which have the same values in each of the plurality of event entries; the rollup entry includes a field which contains a digest of the values of the one or more fields; and the method includes the step of making the digest from the one or more fields.
44. The data storage device set forth in claim 43 further characterized in that: the one or more fields specify a class of events to which the event that is specified by each of the event entries belongs.
45. The data storage device set forth in claim 44 further characterized in that: the one or more fields specify the class of events by specifying the source of the event and a condition that caused the event.
46. The data storage device set forth in claim 43 further characterized in that: the field from which the digest is made is a message describing the event.
47. A data storage device, characterized in that: the data storage device contains code which when executed by a processor performs a method of determining whether there is a relationship between different types of events in a database system that employs roll up tables whose entries represent events that occur over a period of time and that further include sets of occurrence times during the period of time, the method comprising the steps of: selecting a first roll up table entry for a first type of event; selecting a second roll up table entry for a second type of event that represents the same period of time as the first roll up table entry; and determining whether there is a temporal relationship between at least some of the occurrence times in the first roll up table's set of occurrence times and at least some of the occurrence times in the second roll up table's set of occurrence times.
48. The data storage device set forth in claim 47 further characterized in that: the roll up table entries further include a total number of occurrences value; and the first roll up table entry and the second roll up table entry are selected by comparing the total number of occurrences values to determine whether there may be a relationship between the types of events represented by the first roll up table entry and the second roll up table entry.